Parallel Spectral Clustering
نویسندگان
چکیده
Spectral clustering algorithm has been shown to be more effective in finding clusters than most traditional algorithms. However, spectral clustering suffers from a scalability problem in both memory use and computational time when a dataset size is large. To perform clustering on large datasets, we propose to parallelize both memory use and computation on distributed computers. Through an empirical study on a large document dataset of 193, 844 data instances and a large photo dataset of 637, 137, we demonstrate that our parallel algorithm can effectively alleviate the scalability problem.
منابع مشابه
Parallel Spectral Clustering Algorithm Based on Hadoop
Spectral clustering and cloud computing is emerging branch of computer science or related discipline. It overcome the shortcomings of some traditional clustering algorithm and guarantee the convergence to the optimal solution, thus have to the widespread attention. This article first introduced the parallel spectral clustering algorithm research background and significance, and then to Hadoop t...
متن کاملParallel Spectral Clustering Algorithm for Large-Scale Community Data Mining
The spectral clustering algorithm has been shown to be very effective in finding clusters of non-linear boundaries. Unfortunately, spectral clustering suffers from the scalability problem in both memory use and computational time. In this work, we parallelize the algorithm by dividing both memory use and computation on distributed machines. Empirical study on some small datasets shows the accur...
متن کاملSegmentation of cDNA Microarray Images using Parallel Spectral Clustering
Microarray Image Microarray technology generates large amounts of expression level of genes to be analyzed simultaneously. This analysis implies microarray image segmentation to extract the quantitative information from spots. Spectral clustering is one of the most relevant unsupervised methods able to gather data without a priori information on shapes or locality. We propose and test on micro...
متن کاملPSC: Parallel Spectral Clustering
Spectral clustering algorithm has been shown to be more effective in finding clusters than some traditional algorithms such as k-means. However, spectral clustering suffers from a scalability problem in both memory use and computational time when the size of a data set is large. To perform clustering on large data sets, we investigate two representative ways of approximating the dense similarit...
متن کاملComparing k-means clusters on parallel Persian-English corpus
This paper compares clusters of aligned Persian and English texts obtained from k-means method. Text clustering has many applications in various fields of natural language processing. So far, much English documents clustering research has been accomplished. Now this question arises, are the results of them extendable to other languages? Since the goal of document clustering is grouping of docum...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2008